Conversation

davidkyle
Member

@davidkyle davidkyle commented Sep 24, 2025

A model deployment that is gracefully shut down waits until the queued work is done (or a timeout elapses) before terminating the inference process. Waiting for the work to complete should not block a thread, as the wait may last up to 5 minutes. This change adds a callback to the worker queue that terminates the process once the queued work has completed.
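For readers who want the shape of the idea without opening the diff, here is a minimal sketch. It is not the code in this PR: `GracefulShutdownSketch`, `submit`, and `stopGracefully` are hypothetical names, the worker queue is assumed to be a single-threaded FIFO executor, and `CompletableFuture.orTimeout` requires Java 9 or later. A marker task is enqueued behind the pending work, and the terminate callback is chained onto it, so no thread waits for the queue to drain.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Illustrative only; these are not the Elasticsearch classes touched by the PR. */
final class GracefulShutdownSketch {

    /** Assumption: a single-threaded FIFO executor stands in for the deployment's worker queue. */
    private final ExecutorService workerQueue = Executors.newSingleThreadExecutor();

    /** Queue one unit of work. Throws RejectedExecutionException once a stop has been requested. */
    void submit(Runnable inferenceRequest) {
        workerQueue.execute(inferenceRequest);
    }

    /**
     * Requests a graceful stop and returns without waiting for queued work.
     * The callback runs once, after the queue drains or after the timeout,
     * whichever happens first. The returned future completes with true if the
     * queue drained and false if the timeout elapsed.
     */
    CompletableFuture<Boolean> stopGracefully(long timeout, TimeUnit unit, Runnable terminateProcess) {
        // Marker task: on a single-threaded FIFO queue it runs only after all earlier work.
        CompletableFuture<Void> drained = CompletableFuture.runAsync(() -> {}, workerQueue);
        // Reject new work; tasks that are already queued still run.
        workerQueue.shutdown();
        return drained
            .orTimeout(timeout, unit)
            .handle((ignored, error) -> {
                boolean timedOut = error != null;
                if (timedOut) {
                    // Discard whatever is still queued and interrupt the running task.
                    workerQueue.shutdownNow();
                }
                // Runs on the worker thread or on the JDK's timeout thread. If the queue
                // was already empty, it may run on the thread that called stopGracefully.
                terminateProcess.run();
                return timedOut == false;
            });
    }
}
```

A caller would invoke something like `stopGracefully(5, TimeUnit.MINUTES, process::kill)`, where `process::kill` is a placeholder for whatever actually stops the inference process, and carry on immediately. The returned future is only there for callers that want to know which path was taken.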

Follow on from #134673

# Conflicts:
#	x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/assignment/TrainedModelAssignmentNodeService.java
#	x-pack/plugin/ml/src/test/java/org/elasticsearch/xpack/ml/inference/assignment/TrainedModelAssignmentNodeServiceTests.java
@davidkyle davidkyle marked this pull request as ready for review September 30, 2025 09:05
@elasticsearchmachine elasticsearchmachine added the Team:ML Meta label for the ML team label Sep 30, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@davidkyle davidkyle enabled auto-merge (squash) September 30, 2025 09:18
@davidkyle davidkyle added the cloud-deploy Publish cloud docker image for Cloud-First-Testing label Sep 30, 2025
@davidkyle davidkyle merged commit 6d2c3ef into elastic:main Oct 1, 2025
35 checks passed

Labels

cloud-deploy Publish cloud docker image for Cloud-First-Testing :ml Machine learning >refactoring Team:ML Meta label for the ML team v9.2.0

3 participants